Feat: Select backend devices via arg (+ add RPC backend support) #1184
stduhpf wants to merge 23 commits into leejet:master from
Conversation
Maybe the backend #if tests in model.cpp, upscaler.cpp, etc. should be changed to runtime tests, too? Also: how hard would it be to support more than one backend with the same sd.cpp binaries - Vulkan and CUDA, for instance?
Good point.
I think removing those #if tests and figuring out a way to build GGML with multiple backends should be enough? Edit: Actually I'm not sure if the #if tests in model.cpp are necessary at all. I could still build with Vulkan enabled when removing those.
I believe it's leftover code. The SD_USE_FLASH_ATTENTION one in common.hpp (the top one), qwen_image.hpp and z_image.hpp are trickier: they test for Vulkan for precision issues. z_image.hpp also has a
I'm pretty sure ggml has runtime checks for the backend type. It would probably be better to use that instead.
So sd.cpp actually supports multi-backend builds? Like SYCL+CUDA at the same time?
@CarlGao4 I'm not sure. I never successfully managed to build sd.cpp with multiple backends, but ggml should be able to handle that. I got it to build with both Vulkan and RPC, but it failed to send data to the RPC server, so I don't know if it would work with other backends (I had to add a way to connect to the RPC server via the CLI). Edit: actually RPC works if
Support for multiple different backends can be achieved for third-party callers simply by switching to the DLL/SO that supports the desired backend.
@wbruna I managed to reproduce the garbage at the end only once in my many tests. I'm not sure what's going on there.
I've just realized I accidentally included my RPC-related changes in the last commit. Since it's somewhat related, should I leave them in, or should I keep that for a follow-up PR? Edit: I'm leaving them in
@wbruna I think it's a good idea to use https://github.com/cpm-cmake/CPM.cmake to manage third-party dependencies in your PR.
@Cyberhan123, you meant @stduhpf :-) But I think that could be interesting regardless of this PR.
The RPC backend doesn't seem to handle parallel tensor loading very well. For now I've added a workaround that disables multi-threading when loading tensors with any RPC backend enabled, but maybe this could be reworked so that only tensors that need to be sent to RPC are loaded sequentially.
fix sdxl conditioner backends
fix sd3 backend display
```cpp
std::string main_backend_device;
std::string diffusion_backend_device;
std::string clip_backend_device;
std::string vae_backend_device;
std::string tae_backend_device;
std::string control_net_backend_device;
std::string upscaler_backend_device;
std::string photomaker_backend_device;
std::string vision_backend_device;
```
It's best to use ggml_backend_t, since this avoids losing access to the underlying primitives. For third-party callers like myself, we can directly use ggml_backend_init_best to get the best backend.
I believe leejet does not want anything ggml-related in stable-diffusion.h or in the examples; it should all be abstracted away. That being said, we could maybe just create a sd_backend_device as a wrapper for ggml_backend_t, and expose functions to get the backend from a name or to select the best backend?
```cpp
const char* main_device;
const char* diffusion_device;
const char* clip_device;
const char* vae_device;
const char* tae_device;
const char* control_net_device;
const char* photomaker_device;
const char* vision_device;
```
Same as above: use ggml_backend_t.
```cpp
#ifdef SD_USE_VULKAN
if (ggml_backend_is_vk(ctx->backend)) {
    to_out_0->set_force_prec_f32(true);
}
#endif
```
In this case, we delegate the abstraction to ggml, allowing it to dynamically load the backend, so we don't need:

```cpp
#ifdef SD_USE_VULKAN
#include "ggml-vulkan.h"
#endif
```
We can add a helper function:

```cpp
// test if the backend is a specific one, e.g. "CUDA", "ROCm", "Vulkan" etc.
static inline bool sd_backend_is(ggml_backend_t backend, const std::string& name) {
    ggml_backend_dev_t dev = ggml_backend_get_device(backend);
    if (!dev) return false;
    std::string dev_name = ggml_backend_dev_name(dev);
    return dev_name.find(name) != std::string::npos;
}
```

I don't really like the idea of using string comparisons for this kind of thing, but I guess it does make things simpler in that case.
```cpp
#ifdef SD_USE_VULKAN
if (ggml_backend_is_vk(ctx->backend)) {
    net_2->set_force_prec_f32(true);
}
#endif
```
```cpp
#ifdef SD_USE_VULKAN
#include "ggml-vulkan.h"
#endif
```
```cpp
#if GGML_USE_HIP
// Prevent NaN issues with certain ROCm setups
if (ggml_backend_is_cuda(ctx->backend)) {
    out_proj->set_scale(1.f / 16.f);
}
#endif
```
Same here: we can handle this check through abstraction instead of including fixed backend header files.
```cmake
option(SD_CUDA    "sd: cuda backend"   OFF)
option(SD_HIPBLAS "sd: rocm backend"   OFF)
option(SD_METAL   "sd: metal backend"  OFF)
option(SD_VULKAN  "sd: vulkan backend" OFF)
option(SD_OPENCL  "sd: opencl backend" OFF)
option(SD_SYCL    "sd: sycl backend"   OFF)
option(SD_MUSA    "sd: musa backend"   OFF)
```
In this case, I mean that after we completely eliminate the backend header file inclusion, we can directly use the GGML definition.
That would be out of scope for this PR I guess, but maybe.
The main goal of this PR is to improve the user experience in multi-GPU setups, allowing users to choose which model part gets sent to which device.

CLI changes:

- New `--main-backend-device [device_name]` argument to set the default backend
- The `--clip-on-cpu`, `--vae-on-cpu` and `--control-net-cpu` arguments are replaced by the `--clip_backend_device [device_name]`, `--vae-backend-device [device_name]` and `--control-net-backend-device [device_name]` arguments
- New `--diffusion_backend_device` (controls the device used for the diffusion/flow models) and `--tae-backend-device` arguments
- New `--upscaler-backend-device`, `--photomaker-backend-device`, and `--vision-backend-device` arguments
- New `--list-devices` argument to print the list of available ggml devices and exit
- New `--rpc` argument to connect to a compatible GGML RPC server

C API changes (stable-diffusion.h):

- New device fields in the `sd_ctx_params_t` struct
- New `void list_backends_to_buffer(char* buffer, size_t buffer_size)` function to write the details of the available devices to a null-terminated char array. Devices are separated by newline characters (`\n`), and the name and description of each device are separated by a `\t` character.
- New `size_t backend_list_size()` function to get the size of the buffer needed for `list_backends_to_buffer`
- New `void add_rpc_device(const char* address);` to connect to a ggml RPC backend (from llama.cpp)

The default device selection should now consistently prioritize discrete GPUs over iGPUs.
For example, if you want to run the text encoders on CPU, you'd now need to use `--clip_backend_device CPU` instead of `--clip-on-cpu`.

TODO:

Important: to use RPC, you need to add `-DSD_RPC=ON` to the build. Additionally, it requires either sd.cpp to be built with the `-DSD_USE_SYSTEM_GGML` flag (I haven't tested that one), or the RPC server to be built with `-DCMAKE_C_FLAGS="-DGGML_MAX_NAME=128" -DCMAKE_CXX_FLAGS="-DGGML_MAX_NAME=128"` (the default is 64).

Fixes #1116